Abstract

Many researchers in artificial intelligence are beginning to explore the use of soft
constraints to express a set of (possibly conflicting) problem requirements. A soft constraint is
a function defined on a collection of variables which associates some measure of desirability
with each possible combination of values for those variables. However, the crucial question
of the computational complexity of finding the optimal solution to a collection of soft
constraints has so far received very little attention. In this paper we identify a class of soft
binary constraints for which the problem of finding the optimal solution is tractable. In
other words, we show that for any given set of such constraints, there exists a polynomial
time algorithm to determine the assignment having the best overall combined measure of
desirability. This tractable class includes many commonly-occurring soft constraints, such
as "as near as possible" or "as soon as possible after", as well as crisp constraints such as
"greater than". Finally, we show that this tractable class is maximal, in the sense that
adding any other form of soft binary constraint which is not in the class gives rise to a class
of problems which is NP-hard.

1. Introduction 

The constraint satisfaction framework is widely acknowledged as a convenient and efficient
way to model and solve a wide variety of problems arising in Artificial Intelligence, including
planning (Kautz & Selman, 1992) and scheduling (van Beek, 1992), image processing
(Montanari, 1974) and natural language understanding (Allen, 1995).

In the standard framework a constraint is usually taken to be a predicate, or relation, 
specifying the allowed combinations of values for some fixed collection of variables: we will 
refer to such constraints here as crisp constraints. A number of authors have suggested 
that the usefulness of the constraint satisfaction framework could be greatly enhanced by 
extending the definition of a constraint to include also soft constraints, which allow different 
measures of desirability to be associated with different combinations of values (Bistarelli 
et al., 1997, 1999). In this extended framework a constraint can be seen as a function,
mapping each possible combination of values to a measure of desirability or undesirability.
Finding a solution to a set of constraints then means finding an assignment of values to all 
of the variables which has the best overall combined desirability measure. 

Example 1.1 Consider an optimization problem with $2n$ variables, $v_1, v_2, \ldots, v_{2n}$, where
we wish to assign each variable an integer value in the range $1, 2, \ldots, n$, subject to the
following restrictions:

• Each variable $v_i$ should be assigned a value that is as close as possible to $i/2$.

• Each pair of variables $v_i, v_{2i}$ should be assigned a pair of values that are as close as
possible to each other.

To model this situation we might impose the following soft constraints:

• A unary constraint on each $v_i$ specified by a function $\psi_i$,
where $\psi_i(x) = (x - i/2)^2$.

• A binary constraint on each pair $v_i, v_{2i}$ specified by a function $\delta_r$,
where $\delta_r(x,y) = |x - y|^r$ for some $r \ge 1$.

We would then seek an assignment to all of the variables which minimizes the sum of all of
these constraint functions,

$$\sum_{i=1}^{2n} \psi_i(v_i) + \sum_{i=1}^{n} \delta_r(v_i, v_{2i}).$$

□
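The objective of Example 1.1 can be made concrete with a small exhaustive search. The following is only an illustrative sketch: the function names are hypothetical, the exponent is fixed at r = 2 by default as an assumption, and the search enumerates all n^(2n) assignments, so it is usable only for very small n.

```python
from itertools import product

def example_1_1_cost(values, r=2):
    """Objective of Example 1.1; values[i-1] is the value assigned to v_i."""
    n = len(values) // 2
    # unary terms psi_i(v_i) = (v_i - i/2)^2 for i = 1..2n
    unary = sum((values[i - 1] - i / 2) ** 2 for i in range(1, 2 * n + 1))
    # binary terms delta_r(v_i, v_2i) = |v_i - v_2i|^r for i = 1..n
    binary = sum(abs(values[i - 1] - values[2 * i - 1]) ** r for i in range(1, n + 1))
    return unary + binary

def brute_force_optimum(n, r=2):
    """Try every assignment of values 1..n to the 2n variables (exponential)."""
    domain = range(1, n + 1)
    return min(product(domain, repeat=2 * n),
               key=lambda vals: example_1_1_cost(vals, r))

best = brute_force_optimum(2)
print(best, example_1_1_cost(best))
```

The exhaustive search is there only to fix the meaning of "optimal assignment"; it says nothing about how such instances can be solved efficiently.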

The cost of allowing additional flexibility in the specification of constraints, in order to
model requirements of this kind, is generally an increase in computational difficulty. In
the case of crisp constraints there has been considerable progress in identifying classes of
constraints which are tractable, in the sense that there exists a polynomial time algorithm
to determine whether or not any collection of constraints from such a class can be
simultaneously satisfied (Bulatov, 2003; Feder & Vardi, 1998; Jeavons et al., 1997). In the case
of soft constraints there has been a detailed investigation of the tractable cases for Boolean
problems (where each variable has just 2 possible values) (Creignou et al., 2001), but very
little investigation of the tractable cases over larger finite domains, even though there are
many significant results in the literature on combinatorial optimization which are clearly
relevant to this question (Nemhauser & Wolsey, 1988).

The only previous work we have been able to find on the complexity of non-Boolean 
soft constraints is a paper by Khatib et al. (2001), which describes a family of tractable soft 
temporal constraints. However, the framework for soft constraints used by Khatib et al. 
(2001) is different from the one we use here, and the results are not directly comparable. 
We discuss the relationship between this earlier work and ours more fully in Section 5. 

In this paper we make use of the idea of a submodular function (Nemhauser & Wolsey,
1988) to identify a general class of soft constraints for which there exists a polynomial time
solution algorithm. Submodular functions are widely used in economics and operational
research (Fujishige, 1991; Nemhauser & Wolsey, 1988; Topkis, 1998), and the notion of
submodularity provides a kind of discrete analogue of convexity (Lovász, 1983).

Submodular functions are usually defined (Nemhauser & Wolsey, 1988) as real-valued
functions on sets (which may be viewed as Boolean tuples), but we consider here the more
general case of functions on tuples over an arbitrary finite domain (as in Topkis, 1978). We
also allow our functions to take infinite values. By establishing a new decomposition result
for binary submodular functions of this kind, we obtain a cubic time algorithm to find the
optimal assignment for any set of soft constraints which can be defined using them (such as
the constraints in Example 1.1). Because our algorithm is specially devised for submodular
functions that are expressed as a combination of binary functions, it is much more efficient in
this case than existing general algorithms for submodular function minimization (Schrijver,
2000; Iwata et al., 2001).

We give a number of examples to illustrate the many different forms of soft constraint 
that can be defined using binary submodular functions, and we also show that this class 
is maximal, in the sense that no other form of binary constraint can be added to the class 
without sacrificing tractability. 

2. Definitions 

To identify a tractable class of soft constraints we will need to restrict the set of functions 
that are used to specify constraints. Such a restricted set of possible functions will be called 
a soft constraint language. 

Definition 2.1 Let $D$ and $E$ be fixed sets. A soft constraint language over $D$ with
evaluations in $E$ is defined to be a set of functions, $\Gamma$, such that each $\phi \in \Gamma$ is a function from
$D^k$ to $E$, for some $k \in \mathbb{N}$, where $k$ is called the arity of $\phi$.

For any given choice of soft constraint language, $\Gamma$, we define an associated soft constraint
satisfaction problem, which we will call sCSP($\Gamma$), as follows.

Definition 2.2 Let $\Gamma$ be a soft constraint language over $D$ with evaluations in $E$. An
instance $\mathcal{P}$ of sCSP($\Gamma$) is a triple $\langle V, D, C \rangle$, where:

• $V$ is a finite set of variables, which must be assigned values from the set $D$.

• $C$ is a set of soft constraints. Each $c \in C$ is a pair $\langle \sigma, \phi \rangle$ where: $\sigma$ is a list of variables,
of length $|\sigma|$, called the scope of $c$; and $\phi$ is an element of $\Gamma$ of arity $|\sigma|$, called the
evaluation function of $c$.

The evaluation function $\phi$ will be used to specify some measure of desirability or
undesirability associated with each possible tuple of values over $\sigma$.

To complete the definition of a soft constraint satisfaction problem we need to define how 
the evaluations obtained from each evaluation function are combined and compared, in order 
to define what constitutes an optimal overall solution. Several alternative mathematical 
approaches to this issue have been suggested in the literature: 

• In the semiring based approach (Bistarelli et al., 1997, 1999), the set of possible
evaluations, E, is assumed to be an algebraic structure equipped with two binary
operations, satisfying the axioms of a semiring. One example of such a structure is
the real interval [0, 1], equipped with the operations min and max, which corresponds
to the conjunctive fuzzy CSP framework (Rosenfeld et al., 1976; Ruttkay, 1994).
Another example is the set {0, 1, 2, . . .} ∪ {∞}, equipped with the operations max and
plus, which corresponds to the weighted CSP framework (Bistarelli et al., 1999).

• In the valued CSP approach (Bistarelli et al., 1999), the set of possible evaluations E
is assumed to be a totally ordered algebraic structure with a top and bottom element
and a single monotonic binary operation known as aggregation. One example of such
a structure is the set of multisets over some finite ordered set together with a top
element, equipped with the operation of multiset union, which corresponds to the
lexicographic CSP framework (Bistarelli et al., 1999).

For our purposes, we require the same properties as the valued CSP approach, with the
additional requirement that the aggregation operation has a partial inverse, such that
evaluations other than the top element may be "cancelled" when occurring on both sides of an
inequality. For simplicity, we shall assume throughout this paper that the set of evaluations
$E$ is either the set of non-negative integers together with infinity, or else the set of
non-negative real numbers together with infinity¹. Hence, throughout this paper the bottom
element in the evaluation structure is 0, the top element is $\infty$, and for any two evaluations
$\rho_1, \rho_2 \in E$, the aggregation of $\rho_1$ and $\rho_2$ is given by $\rho_1 + \rho_2 \in E$. Moreover, when $\rho_1 \ge \rho_2$
we also have $\rho_1 - \rho_2 \in E$. (Note that we set $\infty - \infty = \infty$.)

The elements of the set E are used to represent different measures of undesirability, or
penalties, associated with different combinations of values. This allows us to complete the 
definition of a soft constraint satisfaction problem with the following simple definition of a 
solution to an instance. 

Definition 2.3 For any soft constraint satisfaction problem instance $\mathcal{P} = \langle V, D, C \rangle$, an
assignment for $\mathcal{P}$ is a mapping $t$ from $V$ to $D$. The evaluation of an assignment $t$, denoted
$\Phi_{\mathcal{P}}(t)$, is given by the sum (i.e., aggregation) of the evaluations for the restrictions of $t$
onto each constraint scope, that is,

$$\Phi_{\mathcal{P}}(t) = \sum_{\langle \langle v_1, v_2, \ldots, v_k \rangle, \phi \rangle \in C} \phi(t(v_1), t(v_2), \ldots, t(v_k)).$$

A solution to $\mathcal{P}$ is an assignment with the smallest possible evaluation, and the question is
to find a solution.

Example 2.4 For any standard constraint satisfaction problem instance $\mathcal{P}$ with crisp
constraints, we can define a corresponding soft constraint satisfaction problem instance $\widehat{\mathcal{P}}$ in
which the range of the evaluation functions of all the constraints is the set $\{0, \infty\}$. For each
crisp constraint $c$ of $\mathcal{P}$, we define a corresponding soft constraint $\hat{c}$ of $\widehat{\mathcal{P}}$ with the same scope;
the evaluation function of $\hat{c}$ maps each tuple allowed by $c$ to 0, and each tuple disallowed
by $c$ to $\infty$.

In this case the evaluation of an assignment $t$ for $\widehat{\mathcal{P}}$ equals the minimal possible
evaluation, 0, if and only if $t$ satisfies all of the crisp constraints in $\mathcal{P}$. □



1. Many of our results can be extended to more general evaluation structures, such as the strictly monotonic 
structures described by Cooper (2003), but we will not pursue this idea here. 
Example 2.5 For any standard constraint satisfaction problem instance $\mathcal{P}$ with crisp
constraints, we can define a corresponding soft constraint satisfaction problem instance $\mathcal{P}^{*}$ in
which the range of the evaluation functions of all the constraints is the set $\{0, 1\}$. For each
crisp constraint $c$ of $\mathcal{P}$, we define a corresponding soft constraint $c^{*}$ of $\mathcal{P}^{*}$ with the same
scope; the evaluation function of $c^{*}$ maps each tuple allowed by $c$ to 0, and each tuple
disallowed by $c$ to 1.

In this case the evaluation of an assignment $t$ for $\mathcal{P}^{*}$ equals the number of crisp
constraints in $\mathcal{P}$ which are violated by $t$. Hence a solution to $\mathcal{P}^{*}$ corresponds to an assignment
which violates the minimal number of constraints of $\mathcal{P}$, and hence satisfies the maximal
number of constraints of $\mathcal{P}$. Finding assignments of this kind is generally referred to as
solving the Max-CSP problem (Freuder & Wallace, 1992; Larrosa et al., 1999). □
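The two translations in Examples 2.4 and 2.5 differ only in the penalty attached to a disallowed tuple. The sketch below illustrates this on a small hypothetical crisp instance (three Boolean variables that must be pairwise different, which is unsatisfiable). The helper names `soften`, `evaluate` and `solve` are assumptions made for this illustration, and the solver is a plain exhaustive search.

```python
import math
from itertools import product

INF = math.inf

def soften(crisp_constraints, penalty):
    """Map crisp constraints (scope, allowed tuples) to soft ones (scope, function)."""
    def make(allowed):
        # allowed tuples cost 0, disallowed tuples cost `penalty`
        return lambda *vals: 0 if vals in allowed else penalty
    return [(scope, make(set(allowed))) for scope, allowed in crisp_constraints]

def evaluate(soft_constraints, t):
    """Evaluation of assignment t (a dict): sum over all constraints (Definition 2.3)."""
    return sum(phi(*(t[v] for v in scope)) for scope, phi in soft_constraints)

def solve(variables, domain, soft_constraints):
    """Exhaustive search for an assignment with the smallest evaluation."""
    candidates = (dict(zip(variables, vals))
                  for vals in product(domain, repeat=len(variables)))
    return min(candidates, key=lambda t: evaluate(soft_constraints, t))

# Hypothetical crisp instance: x != y, y != z, x != z over the domain {0, 1}.
neq = {(0, 1), (1, 0)}
crisp = [(("x", "y"), neq), (("y", "z"), neq), (("x", "z"), neq)]
variables, domain = ["x", "y", "z"], [0, 1]

hard = soften(crisp, INF)    # Example 2.4: range {0, infinity}
maxcsp = soften(crisp, 1)    # Example 2.5: range {0, 1}
print(evaluate(hard, solve(variables, domain, hard)))
print(evaluate(maxcsp, solve(variables, domain, maxcsp)))
```

With the infinite penalty the optimum only reports that the crisp instance has no satisfying assignment, whereas with penalty 1 it counts the fewest constraints that must be violated.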

Note that the problem of finding a solution to a soft constraint satisfaction problem is an
NP optimization problem, that is, it lies in the complexity class NPO (see Creignou et al.,
2001 for a formal definition of this class). If there exists a polynomial-time algorithm which
finds a solution to all instances of sCSP($\Gamma$), then we shall say that sCSP($\Gamma$) is tractable. On
the other hand, if there is a polynomial-time reduction from some NP-complete problem to
sCSP($\Gamma$), then we shall say that sCSP($\Gamma$) is NP-hard.

Example 2.6 Let $\Gamma$ be a soft constraint language over $D$, where $|D| = 2$. In this case
sCSP($\Gamma$) is a class of Boolean soft constraint satisfaction problems.

If we restrict $\Gamma$ even further, by only allowing functions with range $\{0, \infty\}$, as in
Example 2.4, then sCSP($\Gamma$) corresponds precisely to a standard Boolean crisp constraint
satisfaction problem. Such problems are sometimes known as Generalized Satisfiability
problems (Schaefer, 1978). The complexity of sCSP($\Gamma$) for such restricted sets $\Gamma$ has
been completely characterised, and it has been shown that there are precisely six tractable
cases (Schaefer, 1978; Creignou et al., 2001).

Alternatively, if we restrict $\Gamma$ by only allowing functions with range $\{0, 1\}$, as in
Example 2.5, then sCSP($\Gamma$) corresponds precisely to a standard Boolean maximum satisfiability
problem, in which the aim is to satisfy the maximum number of crisp constraints. Such
problems are sometimes known as Max-Sat problems (Creignou et al., 2001). The
complexity of sCSP($\Gamma$) for such restricted sets $\Gamma$ has been completely characterised, and it has
been shown that there are precisely three tractable cases (see Theorem 7.6 of Creignou
et al., 2001).

We note, in particular, that when $\Gamma$ contains just the single binary function $\phi_{XOR}$
defined by

$$\phi_{XOR}(x,y) = \begin{cases} 0 & \text{if } x \ne y \\ 1 & \text{otherwise} \end{cases}$$

then sCSP($\Gamma$) corresponds to the Max-Sat problem for the exclusive-or predicate, which
is known to be NP-hard (see Lemma 7.4 of Creignou et al., 2001). □

Example 2.7 Let $\Gamma$ be a soft constraint language over $D = \{1, 2, \ldots, M\}$, where $M \ge 3$,
and assume that $\Gamma$ contains just the set of all unary functions, together with the single
binary function $\phi_{EQ}$ defined by

$$\phi_{EQ}(x,y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{otherwise.} \end{cases}$$

Even in this very simple case it can be shown that sCSP($\Gamma$) is NP-hard, by reduction
from the Minimum 3-Terminal Cut problem (Dahlhaus et al., 1994). An instance of this
problem consists of an undirected graph $(V, E)$ in which each edge $e \in E$ has an associated
weight, together with a set of distinguished vertices, $\{v_1, v_2, v_3\} \subseteq V$, known as terminals.
The problem is to find a set of edges with the smallest possible total weight whose removal
disconnects each possible pair of terminals. Such a set is known as a minimum 3-terminal
cut.

To obtain the reduction to sCSP($\Gamma$), let $I$ be an instance of Minimum 3-Terminal Cut
consisting of the graph $(V, E)$ with terminals $\{v_1, v_2, v_3\}$. We construct a corresponding
instance $\mathcal{P}_I$ of sCSP($\Gamma$) as follows. The variables of $\mathcal{P}_I$ correspond to the set of vertices $V$.
For each edge $\{v_i, v_j\} \in E$, add a binary soft constraint with scope $\langle v_i, v_j \rangle$ and evaluation
function $\phi_{EQ}$, as above. Finally, for each terminal $v_i \in \{v_1, v_2, v_3\}$, add a unary constraint
on the variable $v_i$ with evaluation function $\psi_i$, defined as follows:

$$\psi_i(x) = \begin{cases} 0 & \text{if } x = i \\ |E| + 1 & \text{otherwise.} \end{cases}$$

It is straightforward to check that the number of edges in a minimum 3-terminal cut of $I$
is equal to the evaluation of a solution to $\mathcal{P}_I$. □
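The closing claim of Example 2.7 can be checked by brute force on a tiny graph. The sketch below rests on several assumptions made only for illustration: the graph is unweighted (every edge counts 1, matching the "number of edges" in the claim), the domain is taken as M = 3, the test graph is hypothetical, and both sides are computed by exhaustive enumeration.

```python
from itertools import combinations, product

def reduction_optimum(vertices, edges, terminals):
    """Optimal evaluation of the sCSP instance built from the graph (brute force)."""
    big = len(edges) + 1                      # penalty |E| + 1 used by psi_i
    best = None
    for labels in product((1, 2, 3), repeat=len(vertices)):
        t = dict(zip(vertices, labels))
        cost = sum(1 for u, v in edges if t[u] != t[v])        # phi_EQ per edge
        cost += sum(big for i, v in enumerate(terminals, 1)    # psi_i on terminal v_i
                    if t[v] != i)
        best = cost if best is None else min(best, cost)
    return best

def min_3_terminal_cut(edges, terminals):
    """Fewest edges whose removal disconnects every pair of terminals."""
    def connected(a, b, kept):
        seen, stack = {a}, [a]
        while stack:
            u = stack.pop()
            for p, q in kept:
                for s, d in ((p, q), (q, p)):
                    if s == u and d not in seen:
                        seen.add(d)
                        stack.append(d)
        return b in seen
    for size in range(len(edges) + 1):
        for removed in combinations(edges, size):
            kept = [e for e in edges if e not in removed]
            if not any(connected(a, b, kept)
                       for a, b in combinations(terminals, 2)):
                return size

# Hypothetical test graph: terminals a, b, c and one extra vertex d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "d"), ("d", "b"), ("d", "c"), ("a", "b")]
terminals = ["a", "b", "c"]
print(reduction_optimum(vertices, edges, terminals), min_3_terminal_cut(edges, terminals))
```

The large penalty on terminals forces terminal $v_i$ to take value $i$ in any optimal assignment, so the remaining cost counts exactly the edges whose endpoints receive different values.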

The examples above indicate that generalizing the constraint satisfaction framework to
include soft constraints does indeed increase the computational complexity, in general. For
example, the standard 2-Satisfiability problem is tractable, but the soft constraint
satisfaction problem involving only the single binary Boolean function, $\phi_{XOR}$, defined at the
end of Example 2.6, is NP-hard. Similarly, the standard constraint satisfaction problem
involving only crisp unary constraints and equality constraints is clearly trivial, but the soft
constraint satisfaction problem involving only soft unary constraints and a soft version of
the equality constraint, specified by the function $\phi_{EQ}$ defined in Example 2.7, is NP-hard.
However, in the next two sections we will show that it is possible to identify a large class
of functions for which the corresponding soft constraint satisfaction problem is tractable.

3. Generalized Interval Functions 

We begin with a rather restricted class of binary functions, with a very special structure. 

Definition 3.1 Let $D$ be a totally ordered set. A binary function, $\phi : D^2 \to E$ will be called
a generalized interval function on $D$ if it has the following form:

$$\phi(x,y) = \begin{cases} 0 & \text{if } (x < a) \lor (y > b); \\ \rho & \text{otherwise} \end{cases}$$

for some $a, b \in D$ and some $\rho \in E$. Such a function will be denoted $\eta^{\rho}_{[a,b]}$.

We can explain the choice of name for these functions by considering the unary function
$\eta^{\rho}_{[a,b]}(x,x)$. This function returns the value $\rho$ if and only if its argument lies in the interval
$[a, b]$; outside of this interval it returns the value 0.

We shall write $\Gamma_{GI}$ to denote the set of all generalized interval functions on $D$, where
$D = \{1, 2, \ldots, M\}$ with the usual ordering.

Figure 1: The table of values for the function $\eta^{\rho}_{[a,b]}$.

Note that the table of values for any function $\eta^{\rho}_{[a,b]} \in \Gamma_{GI}$ can be written as an $M \times M$
matrix in which all the entries are 0, except for the rectangular region lying between
positions $(a, 1)$ and $(M, b)$, where the entries have value $\rho$, as illustrated in Figure 1. Hence when
$\rho = \infty$, a soft constraint with evaluation function $\eta^{\rho}_{[a,b]}$ is equivalent to a crisp constraint
which is a particular form of connected row-convex constraint (Deville et al., 1999).
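The shape of this table is easy to reproduce. The following minimal sketch implements Definition 3.1 directly; the function names and the particular parameters (M = 4, a = 3, b = 2, ρ = 5) are chosen only for illustration.

```python
def generalized_interval(a, b, rho):
    """eta^rho_[a,b]: 0 if x < a or y > b, and rho otherwise (Definition 3.1)."""
    def eta(x, y):
        return 0 if (x < a or y > b) else rho
    return eta

def table(phi, M):
    """M x M table of values; rows are indexed by x, columns by y, both 1..M."""
    return [[phi(x, y) for y in range(1, M + 1)] for x in range(1, M + 1)]

eta = generalized_interval(a=3, b=2, rho=5)
for row in table(eta, 4):
    print(row)

# The unary function x -> eta(x, x) penalizes exactly the interval [a, b].
interval = generalized_interval(a=2, b=3, rho=7)
print([interval(x, x) for x in range(1, 5)])
```

The non-zero entries occupy rows $x \ge a$ and columns $y \le b$, which is the rectangle between positions $(a, 1)$ and $(M, b)$ described above.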

The main result of this section is Corollary 3.6, which states that sCSP($\Gamma_{GI}$) is tractable.
To establish this result we first define a weighted directed graph² associated with each
instance of sCSP($\Gamma_{GI}$) (see Figure 2).

Definition 3.2 Let $\mathcal{P} = \langle V, \{1, \ldots, M\}, C \rangle$ be an instance of sCSP($\Gamma_{GI}$). We define the
weighted directed graph $G_{\mathcal{P}}$ as follows.

• The vertices of $G_{\mathcal{P}}$ are as follows: $\{S, T\} \cup \{v_d \mid v \in V, d \in \{0, 1, \ldots, M\}\}$.

• The edges of $G_{\mathcal{P}}$ are defined as follows:

- For each $v \in V$, there is an edge from $S$ to $v_M$ with weight $\infty$;

- For each $v \in V$, there is an edge from $v_0$ to $T$ with weight $\infty$;

- For each $v \in V$ and each $d \in \{1, 2, \ldots, M - 2\}$, there is an edge from $v_d$ to $v_{d+1}$
with weight $\infty$;

- For each constraint $\langle \langle v, w \rangle, \eta^{\rho}_{[a,b]} \rangle \in C$, there is an edge from $w_b$ to $v_{a-1}$ with
weight $\rho$. These edges are called "constraint edges".



2. This construction was inspired by a similar construction for certain Boolean constraints described 
by Khanna et al. (2000). 
Figure 2: The graph $G_{\mathcal{P}}$ associated with the instance $\mathcal{P}$ defined in Example 3.3.
(Note that solid arrows indicate edges with infinite weight.)



Example 3.3 Let $\mathcal{P} = \langle \{x, y, z\}, \{1, 2, 3, 4\}, C \rangle$ be an instance of sCSP($\Gamma_{GI}$) with the
following four constraints:

The corresponding weighted directed graph $G_{\mathcal{P}}$ is shown in Figure 2. □

Any set of edges $C$ in the graph $G_{\mathcal{P}}$ whose removal leaves the vertices $S$ and $T$ disconnected
will be called a cut. If $C$ is a minimal set of edges with this property, in the sense that
removing any edge from $C$ leaves a set of edges which is not a cut, then $C$ will be called a
minimal cut. If every edge in $C$ is a constraint edge, then $C$ will be called a proper cut.
The weight of a cut $C$ is defined to be the sum of the weights of all the edges in $C$.

Example 3.4 Consider the graph G-p shown in Figure 2. The set {{ys, ^o)} is a proper cut 
in Gp with weight 7, which is minimal in the sense defined above. The set {{3:4, ^2), {^3? ya)} 
is also a proper cut in G-p with weight 5, which is again minimal in the sense defined above. 

D 
Proposition 3.5 Let $\mathcal{P}$ be any instance of sCSP($\Gamma_{GI}$), and let $G_{\mathcal{P}}$ be the associated
weighted directed graph, as specified in Definition 3.2.

1. For each minimal proper cut in $G_{\mathcal{P}}$ with weight $\Phi$, there is an assignment for $\mathcal{P}$ with
evaluation $\Phi$.

2. For each assignment $t$ for $\mathcal{P}$ with evaluation $\Phi$, there is a proper cut in $G_{\mathcal{P}}$ with
weight $\Phi$.

Proof:

1. Let $C$ be any minimal proper cut of the graph $G_{\mathcal{P}}$, and let $C_S$ be the component of
$G_{\mathcal{P}} \setminus C$ connected to $S$. Since $C$ is proper, $C_S$ always contains $v_M$, and never contains
$v_0$, so we can define the assignment $t_C$ as follows:

$$t_C(v) = \min\{d \mid v_d \in C_S\}$$

By the construction of $G_{\mathcal{P}}$, it follows that:

$$t_C(v) > d \iff v_d \notin C_S \qquad (1)$$

Now consider any constraint $c = \langle \langle v, w \rangle, \eta^{\rho}_{[a,b]} \rangle$ of $\mathcal{P}$, and its associated edge $e$ in $G_{\mathcal{P}}$.
By Definition 3.1 and Equation 1, $\eta^{\rho}_{[a,b]}(t_C(v), t_C(w)) = \rho$ if and only if $v_{a-1} \notin C_S$ and
$w_b \in C_S$, and hence if and only if $e$ joins a vertex in $C_S$ to a vertex not in $C_S$. Since
$C$ is minimal, this happens if and only if $e \in C$. Hence, the total weight of the cut $C$
is equal to the evaluation of $t_C$.

2. Conversely, let $t$ be an assignment to $\mathcal{P}$, and let $K$ be the set of constraints in $\mathcal{P}$ with
a non-zero evaluation on $t$.

Now consider any path from $S$ to $T$ in $G_{\mathcal{P}}$. If we examine, in order, the constraint
edges of this path, and assume that each of the corresponding constraints evaluates
to 0, then we obtain a sequence of assertions of the following form:

$$\begin{aligned}
&(v_{i_0} > M) \lor (v_{i_1} < a_1) \\
&(v_{i_1} > b_2) \lor (v_{i_2} < a_2) && \text{for some } b_2 \ge a_1 \\
&\qquad \vdots \\
&(v_{i_{k-1}} > b_k) \lor (v_{i_k} < a_k) && \text{for some } b_k \ge a_{k-1} \\
&(v_{i_k} > b_{k+1}) \lor (v_{i_{k+1}} < 1) && \text{for some } b_{k+1} \ge a_k
\end{aligned}$$

Since the second disjunct of each assertion contradicts the first disjunct of the next,
these assertions cannot all hold simultaneously, so one of the corresponding constraints
must in fact give a non-zero evaluation on $t$. Hence, every path from $S$ to $T$ includes at
least one edge corresponding to a constraint from $K$, and so the edges corresponding
to the set $K$ form a cut in $G_{\mathcal{P}}$. Furthermore, by the choice of $K$, the weight of this
cut is equal to the evaluation of $t$.

□

Hence, by using a standard efficient algorithm for the Minimum Weighted Cut
problem (Goldberg & Tarjan, 1988), we can find an optimal assignment in cubic time, as the
next result indicates.
Corollary 3.6 The time complexity of sCSP($\Gamma_{GI}$) is $O(n^3 |D|^3)$, where $n$ is the number of
variables.

Proof: Let $\mathcal{P} = \langle V, D, C \rangle$ be any instance of sCSP($\Gamma_{GI}$), and let $G_{\mathcal{P}}$ be the corresponding
weighted directed graph. If the minimum weight for a cut in $G_{\mathcal{P}}$ is $\omega < \infty$, then it must be
a proper cut, so $\mathcal{P}$ has a solution with evaluation $\omega$, by Proposition 3.5. Moreover, if the
minimum weight for a cut in $G_{\mathcal{P}}$ is $\infty$, then the evaluation of every assignment for $\mathcal{P}$ is $\infty$.

Hence we have established a linear-time reduction from sCSP($\Gamma_{GI}$) to the Minimum
Weighted Cut problem.

Since $G_{\mathcal{P}}$ has $v = |V|(|D| + 1) + 2$ vertices, and the time complexity of Minimum
Weighted Cut is $O(v^3)$ (Goldberg & Tarjan, 1988), the result follows. □
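The reduction can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the algorithm analysed in the corollary: it uses a simple augmenting-path (Edmonds-Karp) max-flow routine rather than the Goldberg and Tarjan algorithm cited above, it assumes every penalty ρ is finite, and it replaces the infinite edge weights by a finite stand-in `big` larger than the sum of all penalties. The tuple format `(v, w, a, b, rho)` and the test instance are hypothetical.

```python
from collections import deque
from itertools import product

def solve_by_min_cut(variables, M, constraints):
    """constraints: list of (v, w, a, b, rho), meaning eta^rho_[a,b] on scope (v, w)."""
    big = 1 + sum(c[4] for c in constraints)   # finite stand-in for weight infinity
    cap = {}

    def add_edge(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)                # residual (reverse) edge

    for v in variables:                        # graph of Definition 3.2
        add_edge("S", (v, M), big)
        add_edge((v, 0), "T", big)
        for d in range(1, M - 1):              # d = 1, ..., M-2
            add_edge((v, d), (v, d + 1), big)
    for v, w, a, b, rho in constraints:
        add_edge((w, b), (v, a - 1), rho)      # constraint edge

    flow = 0
    while True:                                # repeat BFS for augmenting paths
        parent = {"S": None}
        queue = deque(["S"])
        while queue and "T" not in parent:
            u = queue.popleft()
            for x, c in cap[u].items():
                if c > 0 and x not in parent:
                    parent[x] = u
                    queue.append(x)
        if "T" not in parent:
            break                              # parent now holds the S-side of a min cut
        path, x = [], "T"
        while parent[x] is not None:
            path.append((parent[x], x))
            x = parent[x]
        push = min(cap[u][x] for u, x in path)
        for u, x in path:
            cap[u][x] -= push
            cap[x][u] += push
        flow += push

    side_s = set(parent)
    # t(v) = min{d | v_d on the S-side}, as in the proof of Proposition 3.5
    t = {v: min(d for d in range(1, M + 1) if (v, d) in side_s) for v in variables}
    return flow, t

def evaluate(constraints, t):
    """Sum of eta^rho_[a,b](t(v), t(w)) over all constraints."""
    return sum(rho for v, w, a, b, rho in constraints if t[v] >= a and t[w] <= b)

# Hypothetical instance over D = {1, 2, 3}.
cons = [("x", "x", 1, 1, 2), ("y", "y", 3, 3, 3), ("x", "y", 2, 2, 4)]
value, t = solve_by_min_cut(["x", "y"], 3, cons)
brute = min(evaluate(cons, dict(zip(["x", "y"], vals)))
            for vals in product(range(1, 4), repeat=2))
print(value, t, brute)
```

Comparing against the exhaustive minimum is a convenient sanity check on small instances; on larger ones only the cut-based computation is practical.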



4. Submodular Functions 

In this section we will consider a rather more general and useful class of functions, as 
described by Topkis (1978). 

Definition 4.1 Let $D$ be a totally ordered set. A function, $\phi : D^k \to E$ is called a
submodular function on $D$ if, for all $\langle a_1, \ldots, a_k \rangle, \langle b_1, \ldots, b_k \rangle \in D^k$, we have

$$\phi(\min(a_1, b_1), \ldots, \min(a_k, b_k)) + \phi(\max(a_1, b_1), \ldots, \max(a_k, b_k)) \le \phi(a_1, \ldots, a_k) + \phi(b_1, \ldots, b_k).$$

It is easy to check that all unary functions and all generalized interval functions are
submodular. It also follows immediately from Definition 4.1 that the sum of any two submodular
functions is submodular. This suggests that in some cases it may be possible to express a
submodular function as a sum of simpler submodular functions. For example, for any unary
function $\psi : D \to E$ we have

$$\psi(x) \equiv \sum_{d \in D} \eta^{\psi(d)}_{[d,d]}(x,x).$$

For binary functions, the definition of submodularity can be expressed in a simplified form,
as follows.

Remark 4.2 Let $D$ be a totally ordered set. A binary function, $\phi : D^2 \to E$ is submodular
if and only if, for all $u, v, x, y \in D$, with $u \le x$ and $v \le y$, we have:

$$\phi(u, v) + \phi(x, y) \le \phi(u, y) + \phi(x, v)$$

Note that when $u = x$ or $v = y$ this inequality holds trivially, so it is sufficient to check only
those cases where $u < x$ and $v < y$.

Example 4.3 Let $D$ be the set $\{1, 2, \ldots, M\}$ with the usual ordering, and consider the
binary function $\pi_M$, defined by $\pi_M(x,y) = M^2 - xy$.

For any $u, v, x, y \in D$, with $u \le x$ and $v \le y$, we have:

$$\begin{aligned}
\pi_M(u,v) + \pi_M(x,y) &= 2M^2 - uv - xy \\
&= 2M^2 - uy - xv - (x - u)(y - v) \\
&\le \pi_M(u,y) + \pi_M(x,v).
\end{aligned}$$

Hence, by Remark 4.2, the function $\pi_M$ is submodular. □
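The condition of Remark 4.2 can be tested mechanically on a small domain. The sketch below checks it by exhaustive enumeration for the function of Example 4.3, for a generalized interval function, and for the functions $\phi_{EQ}$ and $\phi_{XOR}$ from Examples 2.7 and 2.6. The helper name `is_submodular` and the chosen domain sizes are assumptions made for illustration.

```python
from itertools import product

def is_submodular(phi, M):
    """Exhaustively check the condition of Remark 4.2 on D = {1, ..., M}."""
    D = range(1, M + 1)
    return all(
        phi(u, v) + phi(x, y) <= phi(u, y) + phi(x, v)
        for u, v, x, y in product(D, repeat=4)
        if u < x and v < y
    )

M = 4
pi_M = lambda x, y: M * M - x * y                    # Example 4.3
eta = lambda x, y: 0 if (x < 3 or y > 2) else 5      # a generalized interval function
phi_eq = lambda x, y: 0 if x == y else 1             # Example 2.7
phi_xor = lambda x, y: 0 if x != y else 1            # Example 2.6

print(is_submodular(pi_M, M), is_submodular(eta, M))
print(is_submodular(phi_eq, 3), is_submodular(phi_xor, 2))
```

The check takes time proportional to $M^4$, which is adequate for experimenting with small tables of values.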

A real-valued $m \times n$ matrix $A$ with the property that

$$A_{uv} + A_{xy} \le A_{uy} + A_{xv}, \quad \text{for all } 1 \le u < x \le m,\ 1 \le v < y \le n$$

is known in operational research as a Monge matrix (for a survey of the properties of such 
matrices and their use in optimization, see Burkard et al., 1996). It is clear from Remark 4.2 
that the table of values for a real-valued binary submodular function is a Monge matrix, 
and conversely, every square Monge matrix can be viewed as a table of values for a binary 
submodular function. 

It was shown by Rudolf and Woeginger (1995) that an arbitrary Monge matrix can be 
decomposed as a sum of simpler matrices. We now obtain a corresponding result for binary 
submodular functions, by showing that any binary submodular function can be decomposed 
as a sum of generalized interval functions. (The result we obtain below is slightly more 
general than the decomposition result for Monge matrices given by Rudolf and Woeginger 
(1995), because we are allowing submodular functions to take infinite values.) Using this 
decomposition result, we will show that the set of unary and binary submodular functions 
is a tractable soft constraint language. 

To obtain our decomposition result, we use the following technical lemma. 

Lemma 4.4 Let $D$ be a totally ordered set and let $\phi : D^2 \to E$ be a binary submodular
function. For any $a, b, c \in D$ such that $a < b < c$, if there exists $e \in D$ with $\phi(e, b) = 0$,
then for all $x \in D$ we have $\phi(x, b) \le \max(\phi(x, a), \phi(x, c))$.

Proof: Assume that $\phi(e, b) = 0$.

• If $x > e$ then, by the submodularity of $\phi$, we have $\phi(x, b) \le \phi(x, b) + \phi(e, a) \le
\phi(x, a) + \phi(e, b) = \phi(x, a)$.

• If $x < e$ then, by the submodularity of $\phi$, we have $\phi(x, b) \le \phi(x, b) + \phi(e, c) \le
\phi(e, b) + \phi(x, c) = \phi(x, c)$.

• If $e = x$ then $\phi(x, b) = 0$.

Hence, in all cases the result holds. □

Lemma 4.5 Let $D$ be a totally ordered finite set. A binary function, $\phi : D^2 \to E$ is
submodular if and only if it can be expressed as a sum of generalized interval functions on
$D$. Furthermore, a decomposition of this form can be obtained in $O(|D|^3)$ time.

Proof: By the observations already made, any function $\phi$ which is equal to a sum of
generalized interval functions is clearly submodular.

To establish the converse, we use induction on the tightness of $\phi$, denoted $\tau(\phi)$, that is,
the number of pairs for which the value of $\phi$ is non-zero.

Assume that $\phi$ is a binary submodular function. If $\tau(\phi) = 0$, then $\phi$ is identically zero,
so the result holds trivially. Otherwise, by induction, we shall assume that the result holds
for all binary submodular functions that have a lower tightness.

To simplify the notation, we shall assume that $D = \{1, 2, \ldots, M\}$, with the usual
ordering.

We will say that a value $a \in D$ is inconsistent if, for all $y \in D$, $\phi(a, y) = \infty$. If every
$a \in D$ is inconsistent, then all values of $\phi$ are $\infty$, so it is equal to the generalized interval
function $\eta^{\infty}_{[1,M]}$, and the result holds. Otherwise, if there exists at least one inconsistent
value, then we can find a pair of values $a, b \in D$, with $|a - b| = 1$, such that $a$ is inconsistent
and $b$ is not inconsistent.

Now define the function $\phi'$ as follows:

$$\phi'(x,y) = \begin{cases} \phi(x,y) & \text{if } x \ne a \\ \phi(b,y) & \text{if } x = a \end{cases}$$

It is straightforward to check that $\phi'$ is submodular and $\phi(x,y) = \phi'(x,y) + \eta^{\infty}_{[a,a]}(x,x)$.
Since $\tau(\phi') \le \tau(\phi)$, it now suffices to show that the result holds for $\phi'$.

By repeating this procedure we may assume that $\phi$ has no inconsistent values, and
by symmetry, that the reversed function $\phi^{T}$, defined by $\phi^{T}(x,y) = \phi(y,x)$, also has no
inconsistent values.

We will say that a value $a \in D$ is penalized if, for all $y \in D$, $\phi(a, y) > 0$. If $a$ is penalized,
then we set $\mu_a = \min\{\phi(a,y) \mid y \in D\}$. If $\mu_a = \infty$, then $a$ is inconsistent, so we may assume
that $\mu_a < \infty$, and define a new function $\phi'$ as follows:

$$\phi'(x,y) = \begin{cases} \phi(x,y) & \text{if } x \ne a \\ \phi(x,y) - \mu_a & \text{if } x = a. \end{cases}$$

Again it is straightforward to check that $\phi'$ is submodular and $\phi(x,y) = \phi'(x,y) + \eta^{\mu_a}_{[a,a]}(x,x)$.
Since $\tau(\phi') < \tau(\phi)$, it now suffices to show that the result holds for $\phi'$.

By repeating this procedure we may assume that neither $\phi$ nor $\phi^{T}$ has any inconsistent
or penalized values.

Now if, for all $a, b \in D$, we have $\phi(a, M) = \phi(M, b) = 0$, then, by submodularity, for
all $a, b \in D$, $\phi(a,b) = \phi(a,b) + \phi(M,M) \le \phi(a,M) + \phi(M,b) = 0$, so $\phi$ is identically 0,
and the result holds trivially. Otherwise, by symmetry, we can choose $a$ to be the largest
value in $D$ such that $\phi(a, M) \ne 0$. Since $a$ is not penalized, we can then choose $r$ to be the
largest value in $D$ such that $\phi(a, r) = 0$. By the choice of $a$, we know that $r < M$, and so
we can define $b = r + 1$. This situation is illustrated in Figure 3.

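For any $x, y \in D$ such that $x \le a$ and $y \ge b$, we have:

$$\begin{aligned}
\phi(x,y) &= \phi(x,y) + \phi(a,r) && (\phi(a,r) = 0) \\
&\ge \phi(x,r) + \phi(a,y) && \text{(submodularity)} \\
&= \phi(x,r) + \max(\phi(a,y), \phi(a,r)) && (\phi(a,r) = 0) \\
&\ge \phi(x,r) + \phi(a,b) && \text{(Lemma 4.4)} \\
&\ge \phi(a,b)
\end{aligned}$$

Hence we can now define a function $\phi'$ as follows:

$$\phi'(x,y) = \begin{cases} \phi(x,y) & \text{if } x > a \lor y < b \\ 0 & \text{if } x = a \land y = b \\ \phi(x,y) - \phi(a,b) & \text{otherwise.} \end{cases}$$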

It is straightforward to check that $\phi(x,y) = \phi'(x,y) + \eta^{\phi(a,b)}_{[b,a]}(y,x)$. Since $\tau(\phi') < \tau(\phi)$, it
only remains to show that $\phi'$ is submodular, and then the result follows by induction. In
other words, it suffices to show that for any $u, v, x, y \in D$ such that $u \le x$ and $v \le y$, we
have:

$$\phi'(u, v) + \phi'(x, y) \le \phi'(u, y) + \phi'(x, v) \qquad (2)$$

Replacing $x$ with $u$ in the inequality derived above, we have that whenever $u \le a$ and $y \ge b$,

$$\phi(u,y) \ge \phi(u,r) + \phi(a,b). \qquad (3)$$

Figure 3: (a) The choice of $a$ and $b$ in the proof of Lemma 4.5. Dotted lines represent
known 0 values of $\phi$. Solid lines represent values of $\phi$ known not to be 0.
(b-d) Representations of the three cases for the choice of $u, v, x, y$. The filled area
represents the non-zero values of the generalized interval constraint subtracted
from $\phi$ to obtain $\phi'$.

The proof of inequality (2) may be divided into four cases, depending on the values of (p{a, b) 
and the choice of u, v, x, y: 

1. (I){a, b) = oo 

In this case, cp' differs from (p only on the pair (a, b) (because oo — oo = oo). Since (p is 
submodular, inequality (2) can only fail to hold if either {x,v) or {u,y) equals (a, 6). 

If {x,v) = {a,b), then, using inequality (3), we know that (l){u,y) = oo, so (pi'{u,y) = 
oo — oo = oo, and inequality (2) holds. 

If $(u,y) = (a,b)$ then we have, for all $x > u$ and $y > v$,

\[
\begin{aligned}
\phi'(u,v) + \phi'(x,y) &= \phi(u,v) + \phi(x,y) \\
  &\leq \phi(u,v) + \max(\phi(x,r), \phi(x,M)) && \text{(by Lemma 4.4)} \\
  &= \phi(u,v) + \phi(x,r) && (x > a \Rightarrow \phi(x,M) = 0) \\
  &\leq \phi(u,r) + \phi(x,v) && \text{(by submodularity)} \\
  &= \phi(x,v) && \text{(since } \phi(u,r) = 0\text{)} \\
  &= \phi'(x,v) \\
  &\leq \phi'(u,y) + \phi'(x,v)
\end{aligned}
\]

so inequality (2) holds.

2. $a < u \leq x$ or $v \leq y < b$ (see Figure 3 part (b))

In this situation we know that inequality (2) holds because $\phi$ and $\phi'$ are identical for
these arguments.

3. $u \leq x \leq a$ or $b \leq v \leq y$ (see Figure 3 part (c))

If $u \leq x \leq a$, then we have:

\[
\begin{aligned}
\phi'(u,v) &= \phi(u,v) - \rho \\
\phi'(x,v) &= \phi(x,v) - \rho \\
\phi'(u,y) &= \phi(u,y) - \rho' \\
\phi'(x,y) &= \phi(x,y) - \rho'
\end{aligned}
\]

where $\rho$ and $\rho'$ are either 0 or $\phi(a,b)$, depending on whether $v$ or $y$ are less than $b$.
Inequality (2) follows trivially by cancelling $\rho$ or $\rho'$ or both.

An exactly similar argument holds if $b \leq v \leq y$.

4. $u \leq a < x$ and $v < b \leq y$ (see Figure 3 part (d))

If $u < a$, then by inequality (3) we have $\phi(u,y) - \phi(a,b) \geq \phi(u,r)$, so $\phi'(u,y) \geq \phi(u,r)$.
Moreover, if $u = a$, then $\phi(u,r) = 0$, so again $\phi'(u,y) \geq \phi(u,r)$. Hence,

\[
\begin{aligned}
\phi'(u,v) + \phi'(x,y) &= \phi(u,v) + \phi(x,y) \\
  &\leq \phi(u,v) + \max(\phi(x,r), \phi(x,M)) && \text{(by Lemma 4.4)} \\
  &= \phi(u,v) + \phi(x,r) && (x > a \Rightarrow \phi(x,M) = 0) \\
  &\leq \phi(u,r) + \phi(x,v) && \text{(by submodularity)} \\
  &\leq \phi'(u,y) + \phi(x,v) && \text{(since } \phi'(u,y) \geq \phi(u,r)\text{)} \\
  &\leq \phi'(u,y) + \phi'(x,v)
\end{aligned}
\]

so again inequality (2) holds.

Hence, in all cases inequality (2) holds, so $\phi'$ is submodular, and the result follows by
induction.

The number of generalized interval functions in the decomposition of a binary submodular
function can grow quadratically with $|D|$ (see Example 4.6 below) and the cost of
subtracting one binary submodular function from another is also quadratic in $|D|$. Hence
a naive algorithm to obtain such a decomposition by calculating the required generalized
interval functions and subtracting off each one in turn from the original function will take
$O(|D|^4)$ time. However, by taking advantage of the simple structure of generalized interval
functions, it is possible to obtain a suitable decomposition in $O(|D|^3)$ time; a possible
algorithm is given in Figure 4. The correctness of this algorithm follows directly from the proof
of the decomposition result given above. □



Example 4.6 Consider the binary function $\pi_M$ on $D = \{1, 2, \ldots, M\}$, defined in Example 4.3.
When $M = 3$, the values of $\pi_3$ are given by the following table:

    \pi_3 |  1   2   3
    ------+------------
      1   |  8   7   6
      2   |  7   5   3
      3   |  6   3   0

Input: A binary submodular function $\phi$ on the set $\{1, 2, \ldots, M\}$
       such that neither $\phi$ nor $\phi^T$ has any inconsistent or penalized values

Output: A set of generalized interval functions $\{\phi_1, \phi_2, \ldots, \phi_q\}$
        such that $\phi(x,y) = \sum_{i=1}^{q} \phi_i(x,y)$

Algorithm:

for j = 1 to M, T[j] = 0                                  % Initialise list of values to be subtracted
for i = M downto 1                                        % For each row...
    while $\phi(i,M)$ > T[M] do                           % If $\phi(i,M)$ not yet zero...
        j = M; while $\phi(i,j)$ > T[j] do j = j - 1      % Find maximal zero position in row i
        $\Delta$ = $\phi(i,j+1)$ - T[j+1]                 % Set new value to be subtracted
        output $\eta^{\Delta}_{[j+1,i]}(y,x)$             % Output generalized interval function
        for k = j+1 to M, T[k] = T[k] + $\Delta$          % Update list of values to be subtracted
    for j = 1 to M, $\phi(i,j)$ = $\phi(i,j)$ - T[j]      % Subtract values from this row

for i = 1 to M, T[i] = 0                                  % Initialise list of values to be subtracted
for j = M downto 1                                        % For each column...
    while $\phi(M,j)$ > T[M] do                           % If $\phi(M,j)$ not yet zero...
        i = M; while $\phi(i,j)$ > T[i] do i = i - 1      % Find maximal zero position in column j
        $\Delta$ = $\phi(i+1,j)$ - T[i+1]                 % Set new value to be subtracted
        output $\eta^{\Delta}_{[i+1,j]}(x,y)$             % Output generalized interval function
        for k = i+1 to M, T[k] = T[k] + $\Delta$          % Update list of values to be subtracted
    for i = 1 to M, $\phi(i,j)$ = $\phi(i,j)$ - T[i]      % Subtract values from this column

Figure 4: A decomposition algorithm with time complexity $O(|D|^3)$
Note that: 




Hence, 

\[
\pi_3(x,y) = \cdots + \eta^{1}_{[2,2]}(x,y) + \eta^{1}_{[3,2]}(x,y) + \eta^{1}_{[2,1]}(x,y) + \eta^{1}_{[3,1]}(x,y).
\]

In general, for arbitrary values of $M$, we have

\[
\pi_M(x,y) = \sum_{d=1}^{M-1} \left( \cdots + \sum_{e=1}^{M-1} \eta^{1}_{[d+1,e]}(x,y) \right).
\]

We remark that this decomposition is not unique: other decompositions exist, including
the symmetric decomposition $\pi_M(x,y) = \pi'_M(x,y) + \pi'_M(y,x)$, where

□
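The decomposition algorithm of Figure 4 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `decompose` and `evaluate`, the dictionary representation of the cost function, and the tuple encoding of each generalized interval function are all assumptions made here. The sketch also assumes finite costs and an input with no inconsistent or penalized values, as the figure requires. The test function is the squared distance $(x-y)^2$, chosen only because every row and column contains a zero.

```python
def decompose(phi, M):
    """Decompose a binary submodular cost table into generalized interval terms.

    phi maps (row, col) pairs over 1..M to finite costs. Each returned term is a
    tuple (weight, a, b, transposed); it stands for eta^weight_[a,b](y, x) when
    transposed is True and for eta^weight_[a,b](x, y) otherwise.
    """
    phi = dict(phi)              # work on a copy; the caller's table is untouched
    terms = []

    # Phase 1: sweep the rows from M down to 1.
    T = [0] * (M + 1)            # T[1..M]: values already subtracted per column
    for i in range(M, 0, -1):
        while phi[i, M] > T[M]:
            j = M
            while phi[i, j] > T[j]:      # find maximal "zero" position in row i
                j -= 1
            delta = phi[i, j + 1] - T[j + 1]
            terms.append((delta, j + 1, i, True))
            for k in range(j + 1, M + 1):
                T[k] += delta
        for j in range(1, M + 1):
            phi[i, j] -= T[j]

    # Phase 2: sweep the columns from M down to 1.
    T = [0] * (M + 1)            # T[1..M]: values already subtracted per row
    for j in range(M, 0, -1):
        while phi[M, j] > T[M]:
            i = M
            while phi[i, j] > T[i]:      # find maximal "zero" position in column j
                i -= 1
            delta = phi[i + 1, j] - T[i + 1]
            terms.append((delta, i + 1, j, False))
            for k in range(i + 1, M + 1):
                T[k] += delta
        for i in range(1, M + 1):
            phi[i, j] -= T[i]
    return terms


def evaluate(terms, x, y):
    """Sum the generalized interval terms at the point (x, y)."""
    total = 0
    for weight, a, b, transposed in terms:
        p, q = (y, x) if transposed else (x, y)
        if p >= a and q <= b:    # eta is 0 when p < a or q > b
            total += weight
    return total


if __name__ == "__main__":
    M = 3
    table = {(x, y): (x - y) ** 2 for x in range(1, M + 1) for y in range(1, M + 1)}
    for term in decompose(table, M):
        print(term)
```

Summing the returned terms with `evaluate` reproduces the input table, which gives a quick sanity check on any candidate input.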

Combining Lemma 4.5 with Corollary 3.6 gives:

Theorem 4.7 For any finite soft constraint language $\Gamma$ on a finite totally ordered set $D$, if
$\Gamma$ contains only unary or binary submodular functions, then the time complexity of sCSP($\Gamma$)
is $O(n^3 |D|^3)$.

The next result shows that the tractable class identified in Theorem 4.7 is maximal. 

Theorem 4.8 Let $\Gamma$ be the set of all binary submodular functions on a totally ordered finite
set $D$, with $|D| \geq 2$. For any binary function $\psi \notin \Gamma$, sCSP($\Gamma \cup \{\psi\}$) is NP-hard.

Proof: We shall give a reduction from sCSP($\{\phi_{XOR}\}$) to sCSP($\Gamma \cup \{\psi\}$), where $\phi_{XOR}$
is the binary function defined in Example 2.6. It was pointed out in Example 2.6 that
sCSP($\{\phi_{XOR}\}$) corresponds to the Max-Sat problem for the exclusive-or predicate, which
is known to be NP-hard (Creignou et al., 2001). Hence sCSP($\Gamma \cup \{\psi\}$) is also NP-hard.

To simplify the notation, we shall assume that $D = \{1, 2, \ldots, M\}$, with the usual ordering.

Since $\psi$ is not submodular, there exist $a, b, c, d \in D$ such that $a < b$ and $c < d$ but
$\psi(a,c) + \psi(b,d) > \psi(a,d) + \psi(b,c)$.

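Choose an arbitrary evaluation $\epsilon$ such that $0 < \epsilon < \infty$, and define $\lambda$ and $\mu$ as follows:

\[
\begin{aligned}
\lambda &= \min(\psi(a,c),\ \psi(a,d) + \psi(b,c) + \epsilon) \\
\mu &= \min(\psi(b,d),\ \psi(a,d) + \psi(b,c) + \epsilon)
\end{aligned}
\]

It is straightforward to check that

\[
\psi(a,d) + \psi(b,c) < \lambda + \mu < \infty. \tag{4}
\]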
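The check behind inequality (4) can be written out as a short case analysis. The shorthand $S$ is introduced here only for illustration, and the argument assumes, as elsewhere in the paper, that all evaluations are non-negative.

```latex
% S abbreviates the smaller side of the violated submodularity inequality.
S := \psi(a,d) + \psi(b,c), \qquad
\psi(a,c) + \psi(b,d) > S \;\Longrightarrow\; S < \infty .

% Upper bound: each minimum is at most S + \epsilon, which is finite.
\lambda + \mu \;\leq\; 2\,(S + \epsilon) \;<\; \infty .

% Lower bound: consider which argument attains each minimum.
\lambda + \mu =
\begin{cases}
\psi(a,c) + \psi(b,d) \;>\; S
  & \text{if } \lambda = \psi(a,c),\ \mu = \psi(b,d), \\
\psi(a,c) + S + \epsilon \;\geq\; S + \epsilon \;>\; S
  & \text{if } \lambda = \psi(a,c),\ \mu = S + \epsilon, \\
S + \epsilon + \psi(b,d) \;\geq\; S + \epsilon \;>\; S
  & \text{if } \lambda = S + \epsilon,\ \mu = \psi(b,d), \\
2S + 2\epsilon \;>\; S
  & \text{if } \lambda = \mu = S + \epsilon .
\end{cases}
```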

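Now define a binary function $\zeta$ as follows:

\[
\zeta(x,y) =
\begin{cases}
\mu & \text{if } (x,y) = (1,a) \\
\lambda & \text{if } (x,y) = (2,b) \\
\infty & \text{otherwise}
\end{cases}
\]

and a binary function $\phi$ as follows:

\[
\phi(x,y) =
\begin{cases}
0 & \text{if } (x,y) = (c,1) \\
\psi(a,d) + \epsilon & \text{if } (x,y) = (c,2) \\
\psi(b,c) + \epsilon & \text{if } (x,y) = (d,1) \\
0 & \text{if } (x,y) = (d,2) \\
\infty & \text{otherwise}
\end{cases}
\]

Figure 5: An instance of sCSP($\Gamma \cup \{\psi\}$) used to construct a specific soft constraint between
variables $x$ and $y$. [Constraint labels legible in the figure: $\zeta(x,t)$, $\psi(t,u)$, $\phi(v,x)$, $\psi(w,v)$, $\zeta(y,w)$.]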



It is straightforward to check that both $\zeta$ and $\phi$ are submodular.

Now consider the instance $\mathcal{P}_0$ of sCSP($\Gamma \cup \{\psi\}$) illustrated in Figure 5. It is simple but
tedious to verify that the combined effect of the six soft constraints shown in Figure 5 on
the variables $x$ and $y$ is equivalent to imposing a soft constraint on these variables with
evaluation function $\chi$, defined as follows:



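\[
\chi(x,y) =
\begin{cases}
\lambda + \mu + \lambda + \mu & \text{if } x, y \in \{1,2\} \text{ and } x = y \\
\lambda + \mu + \psi(a,d) + \psi(b,c) & \text{if } x, y \in \{1,2\} \text{ and } x \neq y \\
\infty & \text{otherwise}
\end{cases}
\]

Note that, by inequality (4), we have $\lambda + \mu + \psi(a,d) + \psi(b,c) < \lambda + \mu + \lambda + \mu < \infty$.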

Now let $\mathcal{P}$ be any instance of sCSP($\{\phi_{XOR}\}$). If we replace each constraint $\langle (x,y), \phi_{XOR} \rangle$
in $\mathcal{P}$ with the set of constraints shown in Figure 5 (introducing fresh variables $t, u, v, w$ each
time) then we obtain an instance $\mathcal{P}'$ of sCSP($\Gamma \cup \{\psi\}$). It is straightforward to check that
$\mathcal{P}'$ has a solution involving only the values 1 and 2, and that such solutions correspond
exactly to the solutions of $\mathcal{P}$, so this construction gives a polynomial-time reduction from
sCSP($\{\phi_{XOR}\}$) to sCSP($\Gamma \cup \{\psi\}$), as required. □
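The gadget used in this reduction can be checked by brute force on small domains. The sketch below is illustrative only: the names `find_violation` and `gadget` are hypothetical, $\epsilon$ is fixed to 1, and the six constraint scopes are assumed to be $\zeta(x,t)$, $\psi(t,u)$, $\phi(u,y)$, $\zeta(y,w)$, $\psi(w,v)$, $\phi(v,x)$ (the scope $\phi(u,y)$ is inferred here, since only five labels are legible in Figure 5). It minimises over the auxiliary variables $t, u, v, w$ and returns the induced cost on $(x, y)$.

```python
import math
from itertools import product

INF = math.inf


def find_violation(psi, M):
    """Return (a, b, c, d) with a < b, c < d and
    psi(a,c) + psi(b,d) > psi(a,d) + psi(b,c), or None if psi is submodular."""
    D = range(1, M + 1)
    for a, b, c, d in product(D, repeat=4):
        if a < b and c < d and psi(a, c) + psi(b, d) > psi(a, d) + psi(b, c):
            return a, b, c, d
    return None


def gadget(psi, M, eps=1):
    """Build zeta and phi from a violation of submodularity in psi and return
    (chi, lam, mu, s), where chi[x, y] is the minimum total cost of the six
    constraints over all values of the auxiliary variables t, u, v, w."""
    a, b, c, d = find_violation(psi, M)
    s = psi(a, d) + psi(b, c)
    lam = min(psi(a, c), s + eps)
    mu = min(psi(b, d), s + eps)

    zeta_table = {(1, a): mu, (2, b): lam}
    phi_table = {(c, 1): 0, (c, 2): psi(a, d) + eps,
                 (d, 1): psi(b, c) + eps, (d, 2): 0}

    def zeta(x, y):
        return zeta_table.get((x, y), INF)

    def phi(x, y):
        return phi_table.get((x, y), INF)

    D = range(1, M + 1)
    chi = {}
    for x, y in product(D, repeat=2):
        chi[x, y] = min(
            zeta(x, t) + psi(t, u) + phi(u, y)
            + zeta(y, w) + psi(w, v) + phi(v, x)
            for t, u, v, w in product(D, repeat=4))
    return chi, lam, mu, s


if __name__ == "__main__":
    # A non-submodular example: equal values are penalized.
    chi, lam, mu, s = gadget(lambda x, y: 1 if x == y else 0, M=3)
    print(chi[1, 1], chi[1, 2], chi[3, 1])
```

On the example in the `__main__` block, equal values of $x$ and $y$ cost $2\lambda + 2\mu$ and unequal values cost $\lambda + \mu + \psi(a,d) + \psi(b,c)$, matching the definition of $\chi$.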



5. Applications 

In this section we give a number of examples to illustrate the wide range of soft constraints 
which can be shown to be tractable using the results obtained in the previous sections. 
First we define a standard way to associate a function with a given relation.

Definition 5.1 For any $k$-ary relation $R$ on a set $D$, we define an associated function,
$\phi_R : D^k \to E$, as follows:

\[
\phi_R(x_1, x_2, \ldots, x_k) =
\begin{cases}
0 & \text{if } (x_1, x_2, \ldots, x_k) \in R \\
\infty & \text{otherwise.}
\end{cases}
\]



By Theorem 4.7, any collection of crisp constraints, where each constraint is specified by a
relation $R$ for which $\phi_R$ is unary or binary submodular, can be solved in cubic time, even
when combined with other soft constraints that are also unary or binary submodular.
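As a small illustration of Definition 5.1, the following sketch builds the associated function of a relation and tests binary submodularity by brute force. The helper names `phi_R` and `is_submodular` and the example relations are assumptions made here; the submodularity test uses the inequality form from the proof of the decomposition result, namely $\phi(u,v) + \phi(x,y) \leq \phi(u,y) + \phi(x,v)$ whenever $u \leq x$ and $v \leq y$.

```python
import math
from itertools import product


def phi_R(relation):
    """Associated function of a relation: 0 on tuples in R, infinity elsewhere."""
    relation = set(relation)
    return lambda *xs: 0 if tuple(xs) in relation else math.inf


def is_submodular(f, M):
    """Brute-force test of binary submodularity on {1, ..., M}."""
    D = range(1, M + 1)
    return all(f(u, v) + f(x, y) <= f(u, y) + f(x, v)
               for u, v, x, y in product(D, repeat=4)
               if u <= x and v <= y)


if __name__ == "__main__":
    M = 6
    D = range(1, M + 1)
    leq = {(x, y) for x in D for y in D if 2 * x <= 3 * y + 1}   # 2X <= 3Y + 1
    eq = {(x, y) for x in D for y in D if 2 * x == y}            # 2X = Y
    neq = {(x, y) for x in D for y in D if x != y}               # X != Y (binary)
    for name, rel in [("leq", leq), ("eq", eq), ("neq", neq)]:
        print(name, is_submodular(phi_R(rel), M))
```

The binary disequality relation is included as a contrast: its associated function fails the test, since $\phi(1,1) + \phi(2,2) = \infty$ while $\phi(1,2) + \phi(2,1) = 0$.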

Example 5.2 The constraint programming language CHIP incorporates a number of constraint
solving techniques for arithmetic and other constraints. In particular, it provides
a constraint solver for a restricted class of crisp constraints over natural numbers, referred
to as basic constraints (van Hentenryck et al., 1992). These basic constraints are of two
kinds, which are referred to as "domain constraints" and "arithmetic constraints". The
domain constraints described by van Hentenryck et al. (1992) are unary constraints which
restrict the value of a variable to some specified finite subset of the natural numbers. The
arithmetic constraints described by van Hentenryck et al. (1992) have one of the following
forms:

    $aX \neq b$            $aX \leq bY + c$
    $aX = bY + c$          $aX \geq bY + c$

where variables are represented by upper-case letters, and constants by lower-case letters,
all constants are non-negative real numbers and $a$ is non-zero.

For each of these crisp constraints the associated function given by Definition 5.1 is
unary or binary submodular. Hence, by Corollary 3.6, any problem involving constraints of
this form can be solved in cubic time. Moreover, any other soft constraints with unary or
binary submodular evaluation functions can be added to such problems without sacrificing
tractability (including the examples below). □

Now assume, for simplicity, that $D = \{1, 2, \ldots, M\}$.

Example 5.3 Consider the binary linear function $\lambda$ defined by $\lambda(x,y) = ax + by + c$, where
$a, b \in \mathbb{R}_+$.

This function is submodular and hence, by Corollary 3.6, any collection of such binary
linear soft constraints over the discrete set $D$ can be solved in cubic time. □

Example 5.4 The Euclidean length function $\sqrt{x^2 + y^2}$ is submodular, and can be used to
express the constraint that a 2-dimensional point $(x,y)$ is "as close to the origin as possible".

□

Example 5.5 The following functions are all submodular:

• $\delta_r(x,y) = |x - y|^r$, where $r \in \mathbb{R}$, $r \geq 1$.

The function $\delta_r$ can be used to express the constraint that: "The values assigned to
the variables $x$ and $y$ should be as similar as possible".

• $\delta_r^{+}(x,y) = (\max(x - y, 0))^r$, where $r \in \mathbb{R}$, $r \geq 1$.

The function $\delta_r^{+}$ can be used to express the constraint that: "The value of $x$ is either
less than or as near as possible to $y$".

• $\delta_r^{\geq}(x,y) = \begin{cases} (x - y)^r & \text{if } x \geq y \\ \infty & \text{otherwise} \end{cases}$, where $r \in \mathbb{R}$, $r \geq 1$.

The function $\delta_r^{\geq}$ can be used to express the temporal constraint that: "$x$ occurs as
soon as possible after $y$".

□
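The submodularity claims in Examples 5.4 and 5.5 can be spot-checked numerically on a small domain. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the third function is taken to be $(x-y)^r$ when $x \geq y$ and $\infty$ otherwise (an assumed reading of the "as soon as possible after" constraint), and a small tolerance is used because non-integer powers are evaluated in floating point. A brute-force check on one domain is evidence, not a proof.

```python
import math
from itertools import product


def is_submodular(f, M, tol=1e-9):
    """Brute-force test on {1, ..., M}: f(u,v) + f(x,y) <= f(u,y) + f(x,v)
    for all u <= x and v <= y, up to a floating-point tolerance."""
    D = range(1, M + 1)
    return all(f(u, v) + f(x, y) <= f(u, y) + f(x, v) + tol
               for u, v, x, y in product(D, repeat=4)
               if u <= x and v <= y)


def delta(r):
    """'As similar as possible': |x - y|^r."""
    return lambda x, y: abs(x - y) ** r


def delta_plus(r):
    """'Less than or as near as possible': max(x - y, 0)^r."""
    return lambda x, y: max(x - y, 0) ** r


def delta_after(r):
    """'As soon as possible after' (assumed form): (x - y)^r if x >= y, else infinity."""
    return lambda x, y: (x - y) ** r if x >= y else math.inf


def euclid(x, y):
    """Euclidean length of the point (x, y)."""
    return math.hypot(x, y)


if __name__ == "__main__":
    M = 6
    for r in (1, 1.5, 2, 3):
        print(r, is_submodular(delta(r), M),
              is_submodular(delta_plus(r), M),
              is_submodular(delta_after(r), M))
    print("euclid", is_submodular(euclid, M))
    print("r = 0.5", is_submodular(delta(0.5), M))
```

The final line shows why the condition $r \geq 1$ matters: for $r = 0.5$ the test fails, for instance at $u = 1, x = 2, v = 2, y = 3$, where the left-hand side is $2$ and the right-hand side is $\sqrt{2}$.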

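Example 5.6 Reconsider the optimization problem defined in Example 1.1. Since $\psi_i$ is
unary, and $\delta_r$ is binary submodular (Example 5.5), this problem can be solved in cubic
time, using the methods developed in this paper.

Let $\mathcal{P}$ be the instance with $n = 3$ and $r = 2$. The values of $\delta_2$ are given by the following
table:

    \delta_2 |  1   2   3
    ---------+------------
       1     |  0   1   4
       2     |  1   0   1
       3     |  4   1   0

Hence,

\[
\begin{aligned}
\delta_2(x,y) ={}& \eta^{1}_{[3,2]}(y,x) + \eta^{1}_{[2,1]}(y,x) + \eta^{2}_{[3,1]}(y,x) \\
 &+ \eta^{1}_{[3,2]}(x,y) + \eta^{1}_{[2,1]}(x,y) + \eta^{2}_{[3,1]}(x,y).
\end{aligned}
\]

Using this decomposition for $\delta_2$, we can construct the graph $G_{\mathcal{P}}$ corresponding to the
instance $\mathcal{P}$, as shown in Figure 6.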

The minimum weight of any cut in this graph is ^, and hence the optimal evaluation 
of any assignment for P is ^ . 

One of the several possible cuts with this weight is indicated by the gray line across the
graph, which corresponds to the solution $v_1 = 1$, $v_2 = 1$, $v_3 = 2$, $v_4 = 2$, $v_5 = 3$, $v_6 = 3$. □

Note that some of the submodular functions defined in this section may appear to be 
similar to the soft simple temporal constraints with semi-convex cost functions defined and 
shown to be tractable by Khatib et al. (2001). However, there are fundamental differences: 
the constraints described by Khatib et al. (2001) are defined over an infinite set of values, 
and their tractability depends crucially on the aggregation operation used for the costs 
being idempotent (i.e., the operation min). In this paper we are considering soft constraints 
over finite sets of values, and an aggregation operation which is strictly monotonic (e.g., 
addition of real numbers), so our results cannot be directly compared with those in the 
paper by Khatib et al. (2001). 

6. Conclusion 

As we have shown with a number of examples, the problem of identifying an optimal
assignment for an arbitrary collection of soft constraints is generally NP-hard. However, by
making use of the notion of submodularity, we have identified a large and expressive class 
of soft constraints for which this problem is tractable. In particular, we have shown that 
binary soft constraints with the property of submodularity can be solved in cubic time. By 
making use of this result, it should be possible to extend the range of optimisation problems 
that can be effectively solved using constraint programming.

Figure 6: The graph $G_{\mathcal{P}}$ associated with the instance $\mathcal{P}$ defined in Example 5.6.

From a theoretical perspective, this paper gives the first complete characterisation of a
tractable class of soft constraints over a finite set of values with more than two elements. We 
are confident that the methods developed here can be extended to identify other tractable 
cases, and hence to begin a systematic investigation of the computational complexity of soft 
constraint satisfaction. A first step in this direction has been taken by Cohen et al. (2003). 

We believe that this work illustrates once again the benefit of interaction between
research on constraint satisfaction and more traditional research on discrete optimization
and mathematical programming: the notion of submodularity comes from mathematical 
programming, but the idea of modelling problems with binary constraints over arbitrary 
finite domains comes from constraint programming. By combining these ideas, we obtain a 
flexible and powerful modelling language with a provably efficient solution strategy.